Language-Based Approaches to Data Consistency

Seminar

In distributed systems, data consistency describes how closely the copies of the same data, replicated on different peers, agree with one another. Consistency levels dictate how much those copies may differ: under a strong consistency level, users observe the exact same state of the data no matter which copy they read. Weaker consistency levels, on the other hand, allow the copies to go out of sync, so that one copy may still return an outdated state while another has already changed.
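The difference between the two kinds of levels can be made concrete with a toy model. The following is a minimal sketch in Python, not taken from any of the languages discussed in this seminar; the `Replica` class, its `outbox`, and the `write_strong` helper are hypothetical names chosen for illustration, and real systems replace the direct loop with network messages and a coordination protocol.

```python
class Replica:
    """One copy of a single replicated value (hypothetical toy model)."""

    def __init__(self, value=0):
        self.value = value
        self.outbox = []  # updates not yet delivered to the other copy

    def write_weak(self, value):
        # Weak consistency: update the local copy now, propagate later.
        self.value = value
        self.outbox.append(value)

    def sync_to(self, other):
        # Deliver pending updates in order; the last one wins in this model.
        for value in self.outbox:
            other.value = value
        self.outbox.clear()


def write_strong(replicas, value):
    # Strong consistency: every copy is updated before the write returns.
    for replica in replicas:
        replica.value = value


a, b = Replica(), Replica()

write_strong([a, b], 1)
strong_reads = (a.value, b.value)     # both copies return the same state

a.write_weak(2)
stale_reads = (a.value, b.value)      # b still returns the outdated state

a.sync_to(b)
converged_reads = (a.value, b.value)  # the copies agree again after syncing
```

Between the weak write and the call to `sync_to`, a reader of `b` sees the old value, which is exactly the window that weaker consistency levels permit.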

Most distributed applications pick one consistency level for the whole application. However, such a global strategy is often too permissive (leading to concurrency bugs) or too restrictive (slowing down the application and preventing offline availability). Ideally, we would pick just the right strategy for each part of the application: permissive where we can be and strict where we need to be.
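A common illustration of this trade-off is a replicated bank account, which is an assumption of this sketch rather than an example from the listed papers. The hypothetical `Account` class and `merge` function below suggest why deposits can tolerate a permissive level while withdrawals need a strict one if the balance must never become negative.

```python
class Account:
    """Hypothetical replicated account that stores its operations as deltas."""

    def __init__(self, deltas=()):
        self.deltas = list(deltas)

    def balance(self):
        return sum(self.deltas)

    def deposit(self, amount):
        # Safe without coordination: a deposit cannot make the balance negative.
        self.deltas.append(amount)

    def withdraw(self, amount):
        # The check only sees the local copy of the data.
        if self.balance() < amount:
            return False
        self.deltas.append(-amount)
        return True


def merge(initial, a, b):
    """Combine the operations two replicas applied on top of a shared start."""
    new_a = a.deltas[len(initial):]
    new_b = b.deltas[len(initial):]
    return Account(list(initial) + new_a + new_b)


start = [100]

# Permissive level for deposits: concurrent updates merge without problems.
a, b = Account(start), Account(start)
a.deposit(10)
b.deposit(20)
deposits_only = merge(start, a, b).balance()

# Permissive level for withdrawals: both local checks pass, the merge overdraws.
a, b = Account(start), Account(start)
ok_a = a.withdraw(80)
ok_b = b.withdraw(80)
overdrawn = merge(start, a, b).balance()

# Strict level for withdrawals, modelled here as one coordinated copy.
shared = Account(start)
ok_first = shared.withdraw(80)
ok_second = shared.withdraw(80)
coordinated = shared.balance()
```

In this sketch the uncoordinated withdrawals end at a balance of -60, while the coordinated ones reject the second request and end at 20. Making every operation strict would also be safe, but it would force deposits to wait for coordination they do not need.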

In recent years, several programming languages have started to make consistency an explicit part of their programming model. This allows compilers and verification tools to make automated decisions about the ideal consistency level for a given application and makes it possible to mix several consistency levels (referred to as hybrid or mixed consistency).

Goal

In this seminar, you will explore several mixed consistency programming languages guided by your own interests. Starting from our pointers, we encourage you to discover more references and either explore a few languages in depth or compare a larger number of approaches more broadly.

Additionally, we encourage you to try one or more of the languages hands-on, for example by implementing a small example program in each of them to compare their features.

Starting Points

Replicated Data Consistency Explained Through Baseball, a general introduction to replicated data consistency (not a programming language)
Consistency Analysis in Bloom: a CALM and Collected Approach
LoRe: A Programming Model for Verifiably Safe Local-first Software
Safe Combination of Data-Centric and Operation-Centric Consistency
Declarative Programming over Eventually Consistent Data Stores

Example Tool

The Hydroflow language
- Paper